Mapping of Sequence Reads to the Reference Genomes    ◾    71

In Section 2.1, we showed how to download the FASTA file of an organism's reference genome sequence and how to index it using “samtools faidx”. If you have not done so, follow the steps in that section to download the human reference and index it. Reference genome sequences can also be downloaded from other databases, such as the UCSC database. We have also downloaded and compressed example paired-end FASTQ files for practice. The next step is to show how to use an aligner (BWA, Bowtie, or STAR) for read mapping.
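Before running any aligner, it can help to confirm that the reference FASTA from Section 2.1 is in place and already indexed. The following is a minimal sketch, not a required step: the file name reference.fasta and the helper has_faidx are hypothetical, and the check relies only on the fact that “samtools faidx” writes its index next to the FASTA file with a “.fai” suffix.

```shell
# Hypothetical file name; substitute the reference FASTA downloaded in Section 2.1.
ref="${REF:-reference.fasta}"

# Illustrative helper: succeed only if the FASTA and its .fai index
# both exist and are non-empty.
has_faidx() {
    [ -s "$1" ] && [ -s "$1.fai" ]
}

if has_faidx "$ref"; then
    echo "indexed: $ref"
else
    echo "not indexed: $ref (run: samtools faidx $ref)"
fi
```

Setting the REF environment variable before running the snippet points the check at a different FASTA file.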

2.3.2.1  Burrows–Wheeler Aligner

The Burrows–Wheeler Aligner (BWA) is a sequence aligner that uses the BWT and the FM-index. We can install the latest version of BWA by following the installation instructions at “https://github.com/lh3/bwa” or by running the following commands:

git clone https://github.com/lh3/bwa.git

cd bwa; make

The commands above clone the BWA source files into your working directory and then compile them. Once BWA has been installed successfully, you may need to set its path so that

FIGURE 2.17  The per base quality reports for the reads in the two FASTQ files.